AD-769 370


THE GENERATION OF FRENCH FROM A
SEMANTIC REPRESENTATION


Annette Herskovits


Stanford University


Prepared for:

Advanced Research Projects Agency
August 1973


DISTRIBUTED BY:


National Technical Information Service
U. S. DEPARTMENT OF COMMERCE

5285 Port Royal Road, Springfield Va. 22151



Unclassified
Security Classification


DOCUMENT CONTROL DATA - R & D

(Security classification of title, body of abstract and indexing
annotation must be entered when the overall report is classified)

1.  ORIGINATING ACTIVITY (Corporate author):
    Stanford University, Computer Science Department,
    Stanford, California, 94305
2a. REPORT SECURITY CLASSIFICATION: Unclassified
2b. GROUP: Blank
3.  REPORT TITLE:
    The generation of French from a semantic representation
4.  DESCRIPTIVE NOTES (Type of report and inclusive dates):
    technical report, August 1973
5.  AUTHOR(S) (First name, middle initial, last name):
    Annette Herskovits
6.  REPORT DATE: August 1973
7a. TOTAL NO. OF PAGES: [illegible]
7b. NO. OF REFS: 3
8a. CONTRACT OR GRANT NO.: SD 183
 b. PROJECT NO.
 c. 457
9a. ORIGINATOR'S REPORT NUMBER(S): STAN-CS-73-384
9b. OTHER REPORT NO(S) (Any other numbers that may be assigned
    this report): AIM-212
10. DISTRIBUTION STATEMENT:
    Releasable without limitations on dissemination
11. SUPPLEMENTARY NOTES: Blank
12. SPONSORING MILITARY ACTIVITY: Blank

STANFORD ARTIFICIAL INTELLIGENCE LABORATORY AUGUST 1973

MEMO NO. AIM-212

COMPUTER SCIENCE DEPARTMENT
REPORT NO. CS-384

THE GENERATION OF FRENCH FROM A SEMANTIC REPRESENTATION


by

ANNETTE HERSKOVITS


ABSTRACT: The report contains first a brief description of Preference
Semantics, a system of representation and analysis of the meaning
structure of natural language. The analysis algorithm which
transforms phrases into semantic items called templates has been
considered in detail elsewhere, so this report concentrates on the
second phase of analysis, which binds templates together into a
higher level semantic block corresponding to an English paragraph,
and which, in operation, interlocks with the French generation
procedure. During this phase, the semantic relations between
templates are extracted, pronouns are referred and those word
disambiguations are done that require the context of a whole
paragraph. These tasks require items called PARAPLATES which are
attached to keywords such as prepositions, subjunctions and relative
pronouns. The system chooses the representation which maximises a
carefully defined "semantic density".

A system for the generation of French sentences is then described,
based on the recursive evaluation of procedural generation patterns
called STEREOTYPES. The stereotypes are semantically context
sensitive, are attached to each sense of English words and keywords
and are carried into the representation by the analysis procedure.
The representation of the meaning of words, and the versatility of
the stereotype format, allow for fine meaning distinctions to appear
in the French, and for the construction of French differing radically
from the English original.

The views and conclusions contained in this document are those of the
author and should not be interpreted as representing necessarily the
official policies, either expressed or implied, of the Advanced
Research Projects Agency or the U. S. Government.

This research was supported by the Advanced Research Projects Agency,
Department of Defense (SD 183), USA.

Reproduced in the USA. Available from the National Technical
Information Service, Springfield, Virginia, 22151.


CHAPTER I


INTRODUCTION.


This paper describes the generation of French sentences from a
semantic representation of natural language conceived by Yorick Wilks
[1]. The generation procedure is part of a system which takes
English paragraphs as input, transforms them into an Interlingual
representation (IR) and outputs a French translation. The system,
called Preference Semantics, differs from earlier attempts at
machine translation (MT) in that it involves no explicit
syntactical analysis, but uses instead semantic means at every level
of analysis and generation. In fact, the system can be said to
"understand" the text translated.

Preference Semantics is characterized by:

1) lexical decomposition. Each sense of a word of the source
language is coded by a tree of semantic markers or elements from a
finite set of fundamental concepts. This structure is called a
"semantic formula".

2) a catalogue of case relationships, such as
actor-action and event-location. Their occurrence in a text is made
explicit: thus, an English sentence is transformed into a network of
lexical decomposition trees, where the arcs represent case
relationships.

3) a network organised on two levels: at the lower
level are templates corresponding to fragments of English (what
constitutes a fragment will be made precise later, but it corresponds
to the concept of a phrase). The templates in turn are organised into
a higher level network. The analysis routines proceed in two stages
corresponding to these two levels of organisation. First the text is
fragmented and the semantic analysis carried out within the context
of a fragment. Then, a second stage deals with semantic relations
between fragments, including the referral of pronouns.

4) a preference principle. At each stage, the system directs itself
toward the correct representation by preferring the most
"semantically dense" one: that is, as a somewhat crude approximation,
the one such that the redundancy among the lexical decomposition
trees is largest.

We feel that lexical decomposition, together with this method of
selecting the right meaning for a sentence, constitutes a reasonable
formalization of the representation humans maintain in their memory
and of the process they carry out when they understand language.
Introspective observation lends intuitive support to the idea that,
whatever complex mental object is associated with a given word sense,
understanding a sentence involves "intersecting" those
representations. Thus if we say "I hear a bark", the right
interpretation arises because the mental objects associated
respectively with "hear" and with "bark" as an animal cry interact
extensively, whereas tree coverings and sounds cannot be connected
in an immediate way. We are convinced that such semantic
connections are used to establish the meaning of an utterance prior
to any grammatical analysis.

Clearly the mental image associated with a word is a very complex
memory item involving sensory as well as symbolic elements. But a
network of fundamental concepts seems a reasonably good map for it,
in terms of the "understanding" performance which an algorithm
working on a "maximum intersection" principle can achieve with it, as
we will see.

Lexical decomposition is one form of a data base of knowledge about
the world, and some general inference making mechanism could plausibly
do the work of the Preference Semantics method of meaning selection.
However the major part of understanding relies on intelligent use of
semantic information which can be made available in adequate lexical
decomposition. This recommends that this information be coded in the
most economical way, and that it be readily accessible without time
consuming search. Preference Semantics seems a most natural and
effective way of meeting these requirements.

However there are some cases when a correct English-French
translation requires the knowledge of facts not naturally expressed in
lexical decomposition, and a way of inferring from the text and from
this store of knowledge. Here is an example:

The soldiers fired at the women; I saw them stagger and fall.

Referring the pronoun "them" in the second sentence would require
some equivalent of the following "reasoning": firing at someone
usually wounds him; wounded people often lose balance and stagger;
thus "them" refers to "the women". The first fact would logically
appear in the lexical decomposition of "fire at"; i.e. the purpose of
"firing at" is usually to hurt. But the rest involves knowledge that
could not be reasonably coded within the semantic formulas of the
words occurring in the sentence.

Thus we are in the process of adding to the system a component called
Common Sense Inferences, which is conceived as a natural extension of
the existing Preference Semantics system, in that it uses the same
formalism and preference principle (Wilks [2]).

Two other problems involved in correctly translating English into
French require machinery of another kind.

1) Consider "I drink wine" and "I like wine". In the first
case we have in French "du vin" and in the second "le vin" (a finite
quantity of wine versus wine as a substance).

2) "I went for a walk this morning" and "I went for a walk
every morning" give respectively: "Je me suis promenee ce matin" and
"Je me promenais tous les matins". The imperfect is used in French
for a repetitive action and the past for a one-time action.

Although in principle questions such as "are we concerned here with
wine as a species" or "is this action habitual" could be answered by
using the inference mechanism, they are too complex to be dealt with
in this way in practice. Thus we will implement special semantic
procedures which will use the semantic representation together with
some heuristics to answer these specific questions.

Correct translation from one language into another is one test of
"understanding" for a computer system. Questions about whether
systems capable of carrying out an "intelligent" conversation exhibit
more "understanding" are meaningless without first defining in some
precise way the class of questions which they are able to answer.
This being rather a difficult problem, we will simply note that with
the Inference component, which is at this time already precisely
defined and being programmed, nothing significant has to be added to
the system to extend it into a question answering system which will
answer a non trivial class of questions. It remains to define "non
trivial" more precisely and to compare the performance of such a
system with other question answering systems.

Wilks has described in detail the semantic representation and the
first stage of the analysis (Wilks [1]); we will thus present here
only a brief description of both, with particular attention to aspects
relevant to the generation procedure. We will then describe in detail
the second stage of analysis (i.e. the interfragment analysis) and
the French generation routines, as they are both conceptually and
programmatically intertwined.




CHAPTER II


THE INTERLINGUA AND INTERNAL ANALYSIS OF FRAGMENTS.


We will first describe the interlingual building blocks or ELEMENTS,
then each significant substructure of the Interlingua together with
the various procedures which constitute the intrafragment analysis.
We will describe the final form of the IR, but will consider the
interfragment analysis only in the next chapter.

ELEMENTS

There are 63 semantic primitives corresponding to fundamental concepts
and relations. Here are some examples (in capital letters), each
followed by a discursive description:

(a) entities:

MAN (human being), STUFF (substances), THING (physical
object) etc...

(b) actions:

HAVE (possesses), FORCE (compels), CAUSE (causes to happen)
etc...

(c) type indicators:

KIND (being a quality), HOW (being a type of action) etc...

(d) sorts:

WHOLE (being a totality), GOOD (being morally acceptable),
THRU (being an aperture) etc...

(e) cases:

AT (location), WITH (instrument), SUBJ (agent), OBJE
(patient of action), IN (containment), POSS (possessed by) etc...


FORMULAS

A semantic formula is a binary tree structure of ELEMENTS, expressing
the semantic content of a concept. In our dictionary, each sense of
an English word is coded with such a formula. For example:

((*ANI SUBJ)((*ANI OBJE)(((LIFE OBJE) NOTHAVE) CAUSE)))

represents the meaning of "to kill".




At any fork of the binary tree, there is a dependency relation of the
left branch upon the right branch. This dependency is
interpreted differently but unambiguously, according to the left and
right subtrees: for example, to the left of CAUSE, we expect to find a
subformula referring to what has been caused. The subformula with
OBJE as a right member indicates the class of preferred objects of
the action, here *ANI or class of animate beings. Similarly, (*ANI
SUBJ) indicates that the subject of "to kill" is generally an animate
being. Thus the whole formula says that "to kill" is "an animate
being causing an animate being to lose life".

A consequence of the left to right dependency rule is that the
rightmost element of a formula, the HEAD, is the primitive whose
semantic scope comprehends most adequately that of the concept
described by the formula. The choice of a head for a given concept is
sometimes debatable.

For example, one sense of "to urge" has been coded:

((MAN SUBJ)((*ANI OBJE)(FORCE TELL)))

The head is TELL, which means "to communicate verbally"; in trying to
define "to urge", this might be the first delimitation of the
meaning we would like to make, given the choice of primitives that is
available to us. However we might prefer FORCE as a head, with the
rightmost subformula (TELL FORCE), thinking of "to urge" as "to
encourage verbally" rather than "to utter encouragements". The
decision is largely dependent, as is the whole coding and even the
basic discrimination of word senses, on the task which we set
ourselves with the interlingual representation. We will come back to
this point later, when speaking specifically of problems of
translation into French.

More details about the syntax and semantics of formulas are available
in Wilks [1].

BARE TEMPLATES

A bare template is an ordered triple of elements, whose semantic
interdependence is that of an agent-act-object triple. Our inventory
of bare templates should contain all and only those triples which can
be built as follows: by aligning the heads of the formulas of the
agent, action and object of any natural language statement which does
not involve nonsense or metaphors. Thus MAN-GIVE-THING is a bare
template, but not MAN-BE-THING (the semantic scope of the elements
GIVE and BE should be obvious). Presumably, no statement respecting
the above restrictions would have for core of meaning "a man is a
physical object"; but "John offered a motorcycle to his son" yields
the bare template MAN-GIVE-THING. The significance of bare
templates lies in the way in which they function in the analysis
algorithm, which we will now sketch.




FRAGMENTATION


The original English text is first fragmented: at punctuation marks;
at keywords such as subjunctions, prepositions, connectives and
relative pronouns; before gerunds; and where "that" has been omitted.
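
A rough sketch of the first two fragmentation rules follows. The
keyword list is a hypothetical stand-in (the report does not list the
system's keywords here), and the rules for gerunds and for an omitted
"that" are deliberately left out because they need more than a word
lookup.

```python
import re

# Hypothetical keyword list; the real system's list is not given in the text.
KEYWORDS = {"and", "that", "into", "in", "at", "of", "which", "because"}


def fragment(text):
    """Split text at punctuation marks and before each keyword."""
    fragments = []
    for clause in re.split(r"[.,;:!?]", text):
        current = []
        for word in clause.split():
            # A keyword opens a new fragment, closing the one in progress.
            if word.lower() in KEYWORDS and current:
                fragments.append(" ".join(current))
                current = []
            current.append(word)
        if current:
            fragments.append(" ".join(current))
    return fragments


sentence = ("Some people believed and said that the student uprising "
            "could have led the country into a revolution.")
for piece in fragment(sentence):
    print(piece)
```

Each keyword stays at the front of the fragment it opens, which matches
the way a key is later recorded for that fragment.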


BARE TEMPLATE MATCHING

As we have seen, to each English word in our dictionary is attached
one or more formulas corresponding to the various senses of the word.
Working within the context of a single fragment, we form all sequences
of formulas which can be obtained by picking, for each word of the
fragment, one of its formulas. The corresponding sequence of heads is
then examined; if three heads, not necessarily consecutive but in the
order of the corresponding text, make a triple which is in the
bare template inventory, then we keep the corresponding sequence of
formulas for further examination; otherwise, this "interpretation" of
the fragment is eliminated. Thus bare template matching is a
tool 1) for cutting down the number of interpretations of the words
in the fragment, 2) for making a first grammatical analysis.

For example: "Small men sometimes father big sons" will give the two
sequences of heads:

KIND MAN HOW MAN KIND MAN

and

KIND MAN HOW CAUSE KIND MAN.

(CAUSE is the head of the verbal sense of "father": "to father" is
analyzed as "to cause to have life".)

The first sequence has no underlying bare template; however, in the
second we find MAN-CAUSE-MAN which is a legitimate bare template.
Thus we have disambiguated "father". At the same time, the matching
proposes one or several plausible agent-action-object substructures.
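
The matching step for this example can be sketched as follows. The
mini-dictionary of heads and the two-entry inventory are assumptions
made for illustration; only the heads named in the text are used.

```python
from itertools import combinations, product

# Hypothetical mini-dictionary: each word maps to the heads of its senses.
HEADS = {
    "small": ["KIND"], "men": ["MAN"], "sometimes": ["HOW"],
    "father": ["MAN", "CAUSE"], "big": ["KIND"], "sons": ["MAN"],
}
# Hypothetical excerpt of the bare template inventory.
INVENTORY = {("MAN", "CAUSE", "MAN"), ("MAN", "GIVE", "THING")}


def match_bare_templates(words):
    """Yield (head sequence, matching triples) for each surviving reading."""
    # One head per word, in every possible combination of senses.
    for heads in product(*(HEADS[w] for w in words)):
        # combinations() keeps text order; the triple need not be consecutive.
        triples = {t for t in combinations(heads, 3) if t in INVENTORY}
        if triples:
            yield heads, triples


readings = list(match_bare_templates(
    "small men sometimes father big sons".split()))
for heads, triples in readings:
    print(" ".join(heads), "->", sorted(triples))
```

Only the reading in which "father" has head CAUSE survives, so the
noun sense is discarded without any syntactic parse.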

However, as not all fragments follow an actor-act-object pattern, we
have extended our inventory of bare templates as follows:

1) we use dummy elements as place-holders for missing items:
DTHIS for the actor and object places, and DBE in the act place. Thus
THING-DBE-DTHIS and MAN-MOVE-DTHIS are legitimate bare templates.

2) we consider that prepositions carry a verbal meaning; thus
they are coded by formulas with heads PDO (for "to", "into", "from"
etc.) or PBE ("in", "at" ...) which occupy the center place in the
relevant bare templates. This yields bare templates
such as DTHIS-PBE-POINT and DTHIS-PDO-THING, which would be matched
respectively upon phrases like "at the crossroad" and "out of the
box" (POINT refers to point-like entities in space or time).


TEMPLATES

The process just described has selected a certain number of formula
triples, which we will refer to as the templates for the fragment.


EXPANSION

The expansion algorithm 1) carries through disambiguation as far as
the context of a fragment permits; 2) performs the work of a
conventional grammar; namely it makes explicit linguistic
dependencies such as that of agent on act, indirect object on act,
qualifier on substantive, etc...

Expansion simply means taking the one or more templates selected by
the preceding matching process in the context of the fragment from
which they came, and looking again at the formulas left behind, those
which did not get picked up by template matching, and seeing which of
them, if any, can be attached to the template structure by a system
of dependencies between formulas. By "dependencies", we mean
relations such as agent-act, act-indirect object, qualifying
adjective-substantive, etc., between the corresponding formulas.

Our preference principle tells us to select, as the correct
representation for a fragment, the most expanded or densest
template: the one for which the greatest number of such dependencies
can be set up. This method can yield virtually all the results of a
conventional grammar, while using only relations between semantic
elements.

The representation derived so far is a sequence of fragments with,
matched onto each, one or several expanded templates. In addition,
each keyword in the dictionary is coded with a list of PARAPLATES
(described in the next chapter) which have been carried along with
the keyword into the still unfinished representation. This is what
will be handed on as input for the second phase of analysis. We will
now describe the final product of the overall analysis process,
leaving aside for the time being the way in which it is derived.


THE LINKS AND FINAL FORM OF THE IR.

We are now concerned with relationships between templates, their
definition and coding. To each expanded template is attached a link.
A link consists of three items of information: the KEY, MARK and
CASE.

The key is the keyword, if any, which triggered fragmentation; else
it is NIL.




The mark is a list of one or several words outside the current
fragment, each of which relates to the current fragment through the
same dependency. The catalogue of dependencies considered includes
linguistic relationships such as:
subject on predicate
governor on prepositional phrase
verb on object
verb of main clause on dependent clause
etc.

The case is a descriptive tag for these dependencies. The list of
case names includes: AT (location in space or time), WITH
(instrument), TO (direction), OUTOF (source), OBJE (object), etc...

Here is an example of an English sentence, fragmented, and with its
key, mark, case and matching bare template:

fragment                   | key  | mark            | case | template
---------------------------+------+-----------------+------+-----------------
Some people believed       | NIL  | NIL             | NIL  | MAN-THINK-DTHIS
and said                   | and  | (people)        | PRED | DTHIS-TELL-DTHIS
that the student uprising  | that | (believed said) | OBJE | ACT-CAUSE-FOLK
could have led the country |      |                 |      |
into a revolution          | into | (led)           | TO   | DTHIS-PDO-ACT

The IR, in its final form, consists of a sequence of fragments of the
original text, with matched onto each:

- one, or sometimes several, links.

- the template, or triple of formulas, on which the bare
template was matched.

- three "qualifier lists" which are lists of formulas
containing the dependents upon the agent, act and object
respectively.
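
One possible way to hold this final form in memory is sketched below.
The class and field names are assumptions chosen for readability, the
template is stored only as its bare-template name, and the qualifier
lists are left empty; the original program is written in LISP and its
actual data layout is not shown in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Link:
    key: Optional[str]     # keyword that triggered fragmentation, None for NIL
    mark: Tuple[str, ...]  # words outside the fragment that it depends on
    case: Optional[str]    # descriptive tag such as OBJE or TO


@dataclass
class FragmentEntry:
    text: str
    links: List[Link]      # one, or sometimes several, links
    template: str          # shown here by its bare template only
    # Dependents upon the agent, act and object respectively.
    qualifiers: Tuple[list, list, list] = field(
        default_factory=lambda: ([], [], []))


# The example sentence from the text, entered by hand.
ir = [
    FragmentEntry("Some people believed",
                  [Link(None, (), None)], "MAN-THINK-DTHIS"),
    FragmentEntry("and said",
                  [Link("and", ("people",), "PRED")], "DTHIS-TELL-DTHIS"),
    FragmentEntry("that the student uprising could have led the country",
                  [Link("that", ("believed", "said"), "OBJE")],
                  "ACT-CAUSE-FOLK"),
    FragmentEntry("into a revolution",
                  [Link("into", ("led",), "TO")], "DTHIS-PDO-ACT"),
]

for entry in ir:
    print(entry.links[0].key, entry.links[0].case, entry.template)
```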


ADJUSTMENT OF THE INTERLINGUA TO THE TASK OF TRANSLATION.

There is a class of discriminations of senses of a word which any
understanding system must do: thus with "rank" in "a rank
vegetation" and in "close the ranks". Outside those, distinctions
are dictated by the task assigned to the understanding system. Thus
Winograd's program, whose behavior requirement is that it understands
and plans the execution of commands concerning the manipulation of
blocks, distinguishes two senses of "on top of": either "directly on
the surface", or "somewhere above". There would be no point in making
that distinction when translating into French, as the output ignores
it. On the other hand, we will need to distinguish between "fish
bones" and "mammal or bird bones", as the first is "arete" and the
second "os".

In fact, an English word has as many semantic formulas attached to it
as it has renderings into French according to context. There is no
limit to the depth of embedding of formulas, so that very fine sense
discriminations can be expressed, and the analysis algorithm embodies
a powerful disambiguation mechanism whose shortcomings are not
related to the fineness of discrimination. Thus we could translate
"maintain" by "maintenir" in "maintain order"; by "entretenir" in
"maintain relations"; and by "garder" in "maintain one's cool". The
three formulas for "maintain" will contain as category of preferred
object: respectively a type of arrangement (GRAIN), an activity
(ACT), and an attitude (STATE).
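
As a simplified sketch of this idea, the choice among the three senses
can be reduced to a lookup on the head of the object's formula. The
heads assigned to "order", "relations" and "cool" below are assumptions
inferred from the categories named in the text, and the real system
selects a sense through its preference mechanism rather than through a
plain table.

```python
# Preferred object category of each sense of "maintain" -> French verb.
MAINTAIN_SENSES = [
    ("GRAIN", "maintenir"),   # a type of arrangement
    ("ACT", "entretenir"),    # an activity
    ("STATE", "garder"),      # an attitude
]
# Hypothetical heads for the object nouns of the three examples.
OBJECT_HEAD = {"order": "GRAIN", "relations": "ACT", "cool": "STATE"}


def translate_maintain(obj):
    """Return the French verb whose preferred object category fits obj."""
    obj_head = OBJECT_HEAD.get(obj)
    for category, verb in MAINTAIN_SENSES:
        if category == obj_head:
            return verb
    return None  # no sense prefers this kind of object


for noun in ("order", "relations", "cool"):
    print(noun, "->", translate_maintain(noun))
```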

A semantic category can perfectly well have a single member, which
enables us to handle some idioms in a general way. For example, one
formula for "to run" is: ((MAN SUBJ)((ACT OBJE)((SELF MOVE) CAUSE)))
where the preferred object subformula is that of "errand" only; the
French thus wants "faire une course", and the generation patterns
which we will describe below are written to produce this output.

Another example of a sense discrimination performed during analysis
is "nearly". In "he nearly died", it becomes a verb in French: "Il a
failli mourir". But "it is nearly morning" gives "c'est presque le
matin". Thus "nearly" has two formulas: one indicates an adverb
which qualifies actions, and the other an adverb qualifying time
entities. The analysis will be able to attach "nearly" to the word
it qualifies, and generation patterns are written to handle the
rephrasing.




CHAPTER III


THE TIE ROUTINES


The role of the TIE routines is to:

1) make explicit the links defined in the last section,
namely the key-mark-case triples binding a whole template to others.

2) disambiguate content-words left unresolved after the
expansion process. The first stage of analysis uses only the context
of a fragment, whereas the TIE routines will consider the context of
a whole sentence or more.

3) refer pronouns in simple cases. There is no easily
defined border line between those examples which require the
Inference making component and those treated in the TIE routines.
Any example requiring world knowledge that is not coded in the
formulas falls into the former category. However, the example "He
drank wine out of a glass and it felt warm in his stomach" requires
extended inferences to refer the "it", although it uses only
information contained in the formulas. For more details see Wilks
[2].

4) attach a generation pattern at certain points in the
template sequence.

To carry out these tasks, we need a process analogous to bare
template matching and to the assessment and counting of dependencies
in the first phase of analysis, but for keys and their context
instead of content words. However, we have adopted a different
organisation; the reason is that the tasks involved require complex
and varied semantic tests to be made on the context of a key. For
example, discriminating between the senses of a key, not only
according to case but also according to French output forms,
necessitates fine and variegated semantic tests. A key has thus been
coded with an ordered list of items called PARAPLATES, whose format
is versatile and can include any desired semantic predicate.


PARAPLATES

A paraplate is:

<list of predicates> <case> <stereotype>

The third item is a generation form used by the generation routines
and described in detail in the next section. The predicates here
assume the form of a LISP function call and refer to LISP procedures.
These procedures may embody any kind of test on the interlingual
context of the key.
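
The three-part format can be sketched in Python as follows. This is an
illustration under stated assumptions: the `Context` dictionary, the
two invented entries for "in", and the rule that the ordered list is
scanned until the first paraplate whose predicates all hold are
hypothetical, and the procedure actually used by the system may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Context = Dict[str, str]  # hypothetical summary of a key's interlingual context


@dataclass
class Paraplate:
    predicates: List[Callable[[Context], bool]]  # tests on the key's context
    case: str                                    # case tag for the link
    stereotype: str                              # stands in for a generation form


def first_match(paraplates: List[Paraplate],
                ctx: Context) -> Optional[Paraplate]:
    """Scan the ordered list; return the first paraplate whose tests all hold."""
    for p in paraplates:
        if all(pred(ctx) for pred in p.predicates):
            return p
    return None


# Simplified, invented entries for the preposition "in".
IN_PARAPLATES = [
    Paraplate([lambda c: c["object_head"] == "THING",
               lambda c: c["mark_head"] == "MOVE"], "TO", "dans"),
    Paraplate([lambda c: c["mark_head"] != "MOVE"], "AT", "dans"),
]

chosen = first_match(IN_PARAPLATES,
                     {"object_head": "THING", "mark_head": "MOVE"})
print(chosen.case)  # TO
```

Keeping the predicates as ordinary callables mirrors the remark that a
predicate may embody any kind of test on the context.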




Do  fore  describing  hou  he  paraplates  are  used  at  a  procedural  level, 
let  us  consider,  as  xcimple,  three  consecutive  paraplates  ou*  pf 

the  list  of  parap  at.es  for  the  preposition  "in",  and  the  class  o- 
contexts  of  "in”  on  which  each  one  will  match: 

1)  (((OBJECT_H THING) (OBJECT_H CONT) (MARK_H MOVE (MOVE CAUSE))
      (MATCH1 WITH GOAL))
     TO
     ((PREOB DANS)))

2)  (((OBJECT_H THING) (MARK_H MOVE (MOVE CAUSE)))
     TO
     ((PREOB DANS)))

3)  (((MATCH2_HEAD) (MARK_H *DO))
     LOCA
     ((PREOB DANS)))

The first paraplate will match the sentence: "I put the key / in the
lock".

The predicates MARK_H and OBJECT_H check upon the formulas of the
mark and object of the preposition. In the first paraplate, they will
be true iff the object of the preposition is a THING and if the mark
is a movement verb (formula with head MOVE or rightmost subformula
(MOVE CAUSE)). The predicate (OBJECT_H CONT) is true iff the object
of the preposition contains the element CONT, i.e. is a container.

Let us assume that, in our dictionary, we have two senses of "lock",
one for lock as a fastener, the other for the lock in a canal. Both
locks are things satisfying ((OBJECT_H THING)) and containers
satisfying ((OBJECT_H CONT)). Thus the first two predicates do not
allow us to discriminate between these two senses. For this, we need
MATCH1.

The predicate MATCH1 considers the object ("key") of the mark and the
object of the preposition ("lock") and is true if their formulas
contain an identical subformula with a rightmost element WITH or
GOAL. This turns out to be the case if the formulas for "key" and
"lock" are those corresponding to the senses appropriate to the
sentence; these formulas express the fact that both corresponding
objects serve the same purpose (GOAL), namely "to forbid the use of
an opening" (or ((((THRU PART) OBJE) NOTUSE) CAUSE) as it appears in
the formula).

The predicate MARK_H tests the semantic formulas of prospective
marks, and is used to select "put" here as the mark, as "put" has
been coded with a rightmost subformula (MOVE CAUSE). Simultaneously,
the directive case TO and the generation form ((PREOB DANS)), ("dans
la serrure"), are selected.




Note that the second paraplate will fit the sentence too. However,
the order of paraplates and the TIE routine's operation are such
that, if a paraplate higher in the list fits, it has priority over
the ones below. For this to be effective in the selection among
interpretations, it is necessary that the order of paraplates
reflects a degree of specificity of the class of contexts the
paraplate fits. Thus a paraplate higher in the list prescribes the
context more tightly than one below, unless they are mutually
exclusive. This is equivalent to saying that more "dependencies" are
ascertained by a higher paraplate, so that it is naturally preferred.

Consider now the sentence: "He put the number / in the table".

There, only the third paraplate will fit, simultaneously selecting
the numerical sort of table and not the flat wooden one. The
predicate MATCH2_HEAD considers the heads of the formulas for
"number" and "table" and is true if they are the same, which is true
only for the correct sense of "table" (both heads being SIGN).

Finally, the sentence "I put the book / in the table" will fit both
paraplates 2 and 3, giving the same sense of table in both cases, that
of a flat surfaced object, but paraplate 2 will be preferred.

In addition to disambiguating, a fitting paraplate will yield a
case, a mark and an adequate generation pattern.
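
The ordered matching just described can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions, not
the report's LISP code: the context is a flat dictionary rather than a
set of formulas, and every name below (Paraplate, first_fitting, the
dictionary keys, the purpose strings) is hypothetical. The predicate
(MARK_H *DO) of the third paraplate is left out for brevity.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

Context = Dict[str, object]

class Paraplate(NamedTuple):
    predicates: List[Callable[[Context], bool]]
    case: str
    stereotype: Tuple[str, str]

def first_fitting(paraplates, context):
    """Return (rank, paraplate) for the highest paraplate in the list
    whose predicates all hold, or None when nothing fits."""
    for rank, paraplate in enumerate(paraplates):
        if all(pred(context) for pred in paraplate.predicates):
            return rank, paraplate
    return None

# Toy stand-ins for OBJECT_H, MARK_H, MATCH1 and MATCH2_HEAD (assumed forms).
def object_has(element):
    return lambda ctx: element in ctx["object_elements"]

def mark_is_move(ctx):
    return bool(ctx["mark_is_move"])

def same_purpose(ctx):
    purpose = ctx["object_purpose"]
    return purpose is not None and purpose == ctx["mark_object_purpose"]

def same_head(ctx):
    return ctx["object_head"] == ctx["mark_object_head"]

IN_PARAPLATES = [
    Paraplate([object_has("THING"), object_has("CONT"), mark_is_move,
               same_purpose], "TO", ("PREOB", "DANS")),
    Paraplate([object_has("THING"), mark_is_move], "TO", ("PREOB", "DANS")),
    Paraplate([same_head], "LOCA", ("PREOB", "DANS")),
]

# "I put the key / in the lock" (fastener sense of "lock")
key_in_lock = {
    "object_elements": {"THING", "CONT"}, "object_head": "THING",
    "mark_is_move": True, "mark_object_head": "THING",
    "object_purpose": "forbid-use-of-opening",
    "mark_object_purpose": "forbid-use-of-opening",
}
# "He put the number / in the table" (numerical sense of "table")
number_in_table = {
    "object_elements": {"SIGN"}, "object_head": "SIGN",
    "mark_is_move": True, "mark_object_head": "SIGN",
    "object_purpose": None, "mark_object_purpose": None,
}

lock_rank, lock_paraplate = first_fitting(IN_PARAPLATES, key_in_lock)
table_rank, table_paraplate = first_fitting(IN_PARAPLATES, number_in_table)
```

Because the list is scanned from the top, the second paraplate is never
reached for the key-in-lock context even though its predicates would
also hold; the rank returned is what a density computation could reward.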


PROGRAM  OPERATION 

Let us first assume that no ambiguity has been left over from the
intrafragment analysis process, so that to each fragment is attached
one expanded template and one only.

The core of the TIE routines consists of a set of rules written in BNF
form representing the sequences of keys and template types
corresponding to normal English sentences, assuming only one expanded
template per fragment. There are S types of templates corresponding
to the permitted combinations of dummy elements in the template; the
class of templates with one dummy element in the subject position is
subdivided into prepositional and verbal-action templates. When
those rules are used to "parse" the semantic representation, the
relations between fragments appear, making it possible to assign mark
and case, provided that the semantic information held in the key
paraplates is simultaneously taken into account. This is done by
"executing" the paraplates of the key in the course of the "parsing",
when there are any.

When this operation is completed, a density coefficient is computed.
This coefficient accounts for dependencies between templates such as
agent-act, antecedent-relative clause, etc.; for prepositional
phrases, the higher in the list the selected paraplate, the greater
is the density increase. This density is used in disambiguating
content-words as follows: formulas for the ambiguous words are
entered in turn in the interlingual fragment; each time, the above
"parsing" is attempted. The set of formulas yielding the densest
"parsing" gets selected, together with its links and stereotypes.
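
As a rough sketch of this selection step, assuming the density of a
reading is available as an ordinary function: each combination of
candidate senses is tried in turn and the densest one is kept. The
function names, the sense labels and the toy density below are
hypothetical; the real coefficient is computed by the TIE routines from
the dependencies they find.

```python
from itertools import product

def densest_reading(candidates, density):
    """candidates maps each ambiguous word to its list of candidate senses.
    Every combination is scored with `density`; the best one is returned."""
    words = list(candidates)
    best_score, best_assignment = None, None
    for combo in product(*(candidates[word] for word in words)):
        assignment = dict(zip(words, combo))
        score = density(assignment)
        if best_score is None or score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score

def toy_density(assignment):
    """Assumed scoring for "He put the number / in the table":
    one dependency for agent-act in every reading, one more when the
    head of "table" equals the head of "number" (SIGN)."""
    score = 1
    if assignment["table"] == "table/SIGN":
        score += 1
    return score

reading, score = densest_reading(
    {"table": ["table/THING", "table/SIGN"]}, toy_density)
```

With several ambiguous words the number of combinations grows
multiplicatively, which is one reason the sketch keeps the scoring
function separate from the enumeration.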


REFERRING  PRONOUNS 

Two processes are used to refer pronouns: one uses only the context
of the fragment containing the pronoun to choose among possible
referents, the other uses the context of a whole sentence or more.

The first procedure works as follows: the program collects
syntactically plausible referents and makes a first selection using
the following observation: substantives depending upon the same
action through various case relationships either cannot refer to the
same object, and this is a semantic impossibility (thus the
direction of an action (movement) cannot be its subject), or else a
reflexive pronoun is used ("He has dedicated the book to himself").

The set of referent candidates is then ordered according to a
priority based on syntactical observations such as: the function of
a pronoun in its context is often the same as that of its referent in
its own context. Thus in "John offered a present to Peter because he
liked him", "he" actually refers to "John" and not to "Peter".
Finally the formulas of the candidates are substituted in turn for
the pronoun inside the template and for each the density of
dependencies is computed as during the expansion process. The
formula giving the highest density or, if there are several of those,
the one among them with highest priority is selected.

The second process is similar to the resolution of content-word
ambiguity by the TIE routines; i.e. possible referents are
substituted in turn in the pronoun place, the parsing is done and the
highest density parsing points to a preferred referent.
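
The choice among candidate referents, highest density first and
syntactic priority as the tie-breaker, might be sketched as follows.
The function name and the density values are assumptions made for the
illustration; only the selection rule comes from the text.

```python
def choose_referent(candidates, density):
    """candidates are listed in decreasing syntactic priority.
    The candidate with the highest density wins; among equals, the one
    with the highest priority (earliest in the list) is kept."""
    best, best_density = None, None
    for candidate in candidates:
        d = density(candidate)
        # Strict comparison: a later candidate must beat, not merely
        # equal, an earlier (higher-priority) one.
        if best_density is None or d > best_density:
            best, best_density = candidate, d
    return best

# "John offered a present to Peter because he liked him"
candidates = ["John", "Peter"]  # "John" first: same function as "he"
tie = choose_referent(candidates, lambda c: 3)
denser = choose_referent(candidates, {"John": 2, "Peter": 4}.get)
```

When both substitutions yield the same density, "John" is returned; a
strictly denser reading for "Peter" would override the priority order.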

As we have seen in the Introduction, these two processes will not
resolve all anaphoric reference problems. The extended inference mode
(Wilks [2]) will then handle remaining ambiguities.




CHAPTER  IV 


THE  GENERATION  ROUTINES 


Translating into French requires the addition of generation patterns
called STEREOTYPES. Those patterns are attached to English words in
the dictionary, both to keys and content words, and carried into the
IR by the analysis.

A  content  word  has  a  list  of  stereotypes  attached  to  each  of  its 
formulas.  When  a  word-sense  is  selected  during  analysis,  this 
list  is  carried  along  with  the  formula  inside  the  IR.  Thus,  for 
translation  purposes,  the  IR  is  not  made  out  simply  of  formulas  but 
of SENSE-PAIRS. A sense-pair is:

    <formula> <list of stereotypes>

As for keys, we have seen in the last section that each key paraplate
contains a stereotype, which gets attached to the template if the
corresponding paraplate has been selected by the TIE routines. This
stereotype is the generation rule to be used for the current fragment
and possibly some of its sequents.


STEREOTYPES 

The simplest form of a stereotype is a French word or phrase standing
for the translation of the English word in the context. Nouns also
carry a gender marker. For example:

    private (a soldier) :  (MASC simple soldat)

    odd (for a number)  :  (impair)

    build               :  (construire)

    brandy              :  (FEMI eau de vie)

Note  that  after  processing  by  the  analysis  routines,  all  words  are 
already  disambiguated.  Several  stereotypes  attached  to  a  formula  do 
not  correspond  to  different  senses  of  the  source  word,  but  to  the 
different  French  constructions  it  can  yield. 

Complex stereotypes are strings of French words and functions. The
functions are functions of the interlingual context of the sense-pair
and evaluate to a string of French words, a blank, or to NIL.
I.e., such stereotypes are CONTEXT-SENSITIVE RULES which check
upon, and generate from, the sense-pair and its context, and this
means other fragments as well as the current one.

When a function in a content word stereotype evaluates to NIL, then
the whole stereotype fails and the next one in the list is tried.




For example, here are the two stereotypes adjoined to the ordinary
sense of "advise":

    (conseiller (PREOB a MAN))

    (conseiller)

The first stereotype would be for translating "I advised my children
to leave". The analysis routines would have matched the bare template
MAN-TELL-MAN on the word triple I-advised-children. The function
PREOB looks at whether the object formula of the template, i.e. the
one for "children" in our example, refers to a human being; if so it
generates a prepositional group with the French preposition "a",
using the object sense-pair and its qualifier list. Here this yields
"a mes enfants", and the value of the whole stereotype is
"conseiller a mes enfants".

For the sentence "I advise patience", whose translation might be "je
conseille la patience", this stereotype would fail, as the object
head is STATE. The second is simply "(conseiller)", because no
prescription on how to translate the object needs to be attached to
"conseiller" when the semantic object goes into a French direct
object, as this is done automatically by the higher level function
which constructs French clauses.

Thus we see that content words have complex stereotypes prescribing
the translation of their context, when they govern an "irregular"
construction, that is irregular by comparison to a set of rules
matching the French syntax on the IR.

The stereotype for a content word can prescribe the translation of
fragments other than the one in which it is included. A generation
rule for a fragment usually comes from some key paraplate. A list of
key paraplates reflects the fact that rules of syntax are usually
based on some semantic classification; i.e., for given semantic
categories and relationships in the context of the key, the output
syntax is represented by the adjoined stereotype. However, in any
natural language there will be exceptions to any classification
scheme. Exceptions are dealt with here by attaching the
replacement generation rule to the word governing the construction
(usually the mark of the fragment).

For example, the paraplates for "to" as in "John told him / to
leave", state that if the mark is an act of verbal communication
(formula head TELL), then the "to" phrase should be translated by
"de" followed by an infinitive: "John lui a dit de partir". This is
generally the case; however "to urge", when going into "exhorter",
has been coded with a TELL head, but gives the construction "a
partir". Thus one of its stereotypes indicates that the construction
following "exhorter" must be "a partir", while the function
supervising the execution of stereotypes ensures that "a partir" will
supersede "de partir", the construction which the key stereotype
attached to the template by TIE would have generated. This stereotype
is as follows:

    (exhorter (DIROB MAN) (FIND-LINK GOAL IR-VP) a (INFVP))

which would apply in the example:


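fragment /                    | key | mark    | case | stereotype
bare template                 |     |         |      |
------------------------------+-----+---------+------+--------------
The delegate urged the women  | NIL | NIL     | NIL  | ((INDCL))
MAN TELL MAN                  |     |         |      |
------------------------------+-----+---------+------+--------------
who were striking             | who | (women) | SPEC | ((WHCL))
MAN NOTDO DTHIS               |     |         |      |
------------------------------+-----+---------+------+--------------
to be patient                 | to  | (urged) | GOAL | (de (INFVP))
DTHIS BE KIND                 |     |         |      |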


In the stereotype above, DIROB constructs a direct object with the
template object if it is a human being.

FIND-LINK takes as arguments a case, and a descriptor of template
types, here IR-VP, which indicates the set of templates with a dummy
subject. It searches the Interlingua down from where "urged"
occurs, for a fragment with case and template type according to the
arguments, and with this occurrence of "urged" itself as a mark. The
third fragment in our example fulfills these conditions. The control
function supervising the evaluation of stereotypes then starts
generating from it, using the piece of stereotype which follows
FIND-LINK, i.e. "a (INFVP)", instead of the stereotype of "to" which
had been selected during TIE (namely (de (INFVP))).
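
A minimal sketch of such a search, assuming the IR is a list of
fragment records. The record fields ("type", "head_word", "mark",
"case", "stereotype") and the type labels other than IR-VP are
hypothetical, and the mark is compared as a plain word here, whereas
the report speaks of a particular occurrence of the word.

```python
def find_link(ir, start, case, template_type):
    """Search downward from fragment `start` for a fragment with the
    given case and template type whose mark is the governing word of
    the starting fragment. Returns its index, or None."""
    mark = ir[start]["head_word"]
    for index in range(start + 1, len(ir)):
        fragment = ir[index]
        if (fragment["case"] == case
                and fragment["type"] == template_type
                and fragment["mark"] == mark):
            return index
    return None

def linked_continuation(ir, start, case, template_type, tail):
    """If a linked fragment exists, return (index, tail): the tail of the
    content-word stereotype is to be used for that fragment in place of
    the key stereotype attached to it. Otherwise return None."""
    target = find_link(ir, start, case, template_type)
    return None if target is None else (target, tail)

IR = [
    {"text": "The delegate urged the women", "type": "FULL",
     "head_word": "urged", "mark": None, "case": None,
     "stereotype": ["INDCL"]},
    {"text": "who were striking", "type": "REL",
     "head_word": "striking", "mark": "women", "case": "SPEC",
     "stereotype": ["WHCL"]},
    {"text": "to be patient", "type": "IR-VP",
     "head_word": "be", "mark": "urged", "case": "GOAL",
     "stereotype": ["de", "INFVP"]},
]

link = linked_continuation(IR, 0, "GOAL", "IR-VP", ["a", "INFVP"])
```

Here the link is found at the third fragment, so generation would
continue from it with "a" followed by the infinitive, and the key
stereotype beginning with "de" would be left unused.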


INFVP generates an infinitive verb-phrase, after inferring its
implicit subject (here women) from the semantics. Acts of verbal
communication involving an attempt to influence the interlocutor,
such as: persuade, order, advise, contain a rightmost
subformula (FORCE TELL) and the subject of the dependant "to" phrase
is their object. The knowledge of the implicit subject is necessary
to proper agreement in French. Thus the translation of the phrase
here is: "a etre patientes" where "patientes" agrees with "les
femmes".


THE  GENERATION  PROCEDURE 


The general form of the generation program is a recursive evaluation
of the functions contained in stereotypes. Thus, depending on its
context of occurrence, a particular word of the French output
sentence may have its origin in stereotypes of different levels:
content word stereotype, key word stereotype or stereotypes that are
part of a set of top level basic functions.

Key stereotypes contain top level functions which will generate
French clauses and prepositional phrases, using the template to which
the stereotype is attached and possibly some of its sequents. The
most frequently encountered functions are:

(PREOB <French preposition>)

This will generate a prepositional group, using for the object the
stereotypes attached to the object formula of the template. It calls
the basic function NOUN-GROUP, which uses a sense-pair and a list of
qualifying sense-pairs to generate a French nominal group.

(INDCL)

Generates a French clause in the indicative mood from an
agent-action-object triple in the IR. Given the process of
fragmenting by key-word, these three elements are sometimes in
different fragments and then the mark and case make explicit their
relationships (the cases used are PRED (predicate) and OBJE
(object)). INDCL calls the basic function CLAUSE-GROUP.

To describe the operation of CLAUSE-GROUP and NOUN-GROUP, it is
necessary to introduce the two functions which handle stereotypes.

SMAP takes a stereotype as argument. It goes down its string,
building a French string in the process, by concatenating the French
words and the result of evaluating the functions. It stops and
returns NIL whenever one of these functions returns NIL; otherwise it
returns the French string constructed. SMAP has also a feature,
described below, which permits the reordering of stereotype strings.

SSELECT takes as argument a list of stereotypes and applies SMAP to
each of its members in turn, until SMAP returns a non-NIL value.
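
The behaviour of these two functions can be sketched in Python as
follows. This is an illustration under assumptions, not the report's
LISP: a stereotype is taken to be a list whose items are either French
words or callables of a context dictionary, NIL is modelled by None and
a blank by the empty string. The names smap, sselect, preob_a_man and
the context keys are hypothetical.

```python
def smap(stereotype, context):
    """Walk down the stereotype, concatenating French words and the
    values of its functions. Fail (None) as soon as a function fails."""
    words = []
    for item in stereotype:
        if callable(item):
            value = item(context)
            if value is None:      # NIL: the whole stereotype fails
                return None
            if value:              # a blank contributes nothing
                words.append(value)
        else:
            words.append(item)
    return " ".join(words)

def sselect(stereotypes, context):
    """Try each stereotype in turn; return the first non-failing value."""
    for stereotype in stereotypes:
        result = smap(stereotype, context)
        if result is not None:
            return result
    return None

def preob_a_man(context):
    """Toy stand-in for (PREOB a MAN): a prepositional group with "a"
    when the object formula has head MAN, failure otherwise."""
    if context["object_head"] != "MAN":
        return None
    return "a " + context["object_french"]

ADVISE = [["conseiller", preob_a_man], ["conseiller"]]

with_children = sselect(
    ADVISE, {"object_head": "MAN", "object_french": "mes enfants"})
with_patience = sselect(
    ADVISE, {"object_head": "STATE", "object_french": "la patience"})
```

For "I advised my children" the first stereotype succeeds; for "I
advise patience" it fails on the object head and the second, bare
stereotype is used, leaving the direct object to the clause-level
function.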

The bodies of the two main syntactical functions CLAUSE-GROUP and
NOUN-GROUP consist of the application of SSELECT to a list of
stereotypes which reads somewhat like the phrase structure rules of
the corresponding French syntactical constituent. The bottom level
functions call recursively SSELECT to work on the list of stereotypes
of a given content word and operate transformations on its output for
proper concord, agreement, etc. To that effect, special variables
carry along information about gender, number, person, etc.

In fact each function in a stereotype calls SSELECT to work on a list
of other stereotypes so that the sequence of SSELECT calls during
execution follows the underlying tree structure of the constituent.
French words found in stereotypes correspond to the terminal nodes.
Generation proceeds from left to right. Concatenation to the right is
done by SMAP.




However some complexity arises from the fragmented structure of the
IR, and with the problem of integrating complex, context-sensitive
stereotypes.

Translating fragment by fragment and preserving the interlingual
order of fragments is inadequate, as exemplified by:

    John said a word / to him.
    -> Jean lui dit un mot.

and:

    The man / with blue eyes / was told / to leave.
    -> On dit a l'homme aux yeux bleus de partir.

Thus, the generation rules of CLAUSE-GROUP and NOUN-GROUP must take
care to pick stereotypes in the IR in an order ensuring a correct
output translation, moving from template to template in the process
if necessary. While evaluating stereotypes, the program maintains a
cursor which points to the fragment which is being generated from.
The purpose of certain functions in stereotypes (such as FIND-LINK
above) is to move the cursor up and down in the IR.


Inserting complex stereotypes in the procedure poses two problems:
first, when evaluated in certain contexts, a stereotype string has to
be reordered. Consider:

    I often urged him to leave. -> Je l'ai souvent exhorte a partir.

The stereotype of "urge" applicable here is:

    (exhorter (DIROB MAN) (FIND-LINK GOAL IR-VP) a (INFVP))

The value of the DIROB, namely "l'", must precede "ai exhorte" and the
adverb "souvent" must be inserted between the auxiliary "ai" and
"exhorte". To accomplish this, SMAP allows for the values of
designated functions in a stereotype to be lifted from it and stored.
Then a new string can be formed by concatenating the stored values
with the values of any other function if desired, in order to produce
the desired output.

Second, we need the implementation of a system of priorities for
regulating the choice of generation rules. Since any word or key can
dictate the output syntax for a given piece of IR, there may arise
conflicts, which are resolved by having carefully settled priorities.
The general idea is that a more specific rule has priority over a
more general one.

Thus when a content word stereotype (normally more specific)
prescribes the translation of fragments other than its immediate
context, it has priority over any key stereotype (normally more
general). As we have seen, in the example "The delegate urged the
women...", generation will proceed from the stereotype of "urge" and
ignore the stereotype (de (INFVP)) attached to the third fragment by
the TIE routines.

CLAUSE-GROUP has a general rule for the object of an action, namely
concatenate the value of NOUN-GROUP applied to it. However this is
overruled whenever the action stereotype dictates a different
handling of the object.

A function REPHRASE allows us complex rephrasings, such as the
following example: "John nearly killed himself", which translates
properly into "John a failli se tuer", i.e. the adverb "nearly"
goes into the verb "faillir". "Nearly" has the following
stereotype:

    ((REPHRASE VERB-GROUP ((VERB-GROUP FAILLIR) (INFVC-0))))

The function REPHRASE indicates that the execution of the function
VERB-GROUP - a constituent in CLAUSE-GROUP - should be replaced by
the evaluation of the stereotype which is its second argument. This
will generate a verb-group constructed from "faillir", followed by an
infinitive verb-group with the "current" subject (that of "faillir")
as its own subject. Any stereotype from a REPHRASE call takes
precedence over whatever stereotypes the substituted function
contained.

Implementation of these priorities requires some functions in the
stereotypes to test other stereotypes in advance in order to decide
what to generate next. The overall control function also does some
book-keeping; i.e. it keeps track of which sense-pairs and fragments
have already been generated from, and which stereotype it used.

The overall control function sets the cursor to the first fragment
and picks up its stereotype; SMAP is run through it, and the cursor
moves up or down in the IR as the recursive structure calls for.
When SMAP pops up, after exhaustion of the first stereotype, the
French phrase that is its value is concatenated to the text already
generated. The program then moves down into the IR until it finds a
fragment which has not been translated yet; the process is then
reiterated as with the first fragment.
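
The control loop and its book-keeping might look roughly like this in
Python. It is a sketch under assumptions: run_stereotype stands for the
whole recursive evaluation of a fragment's stereotype and is supposed to
report which other fragments it consumed; the third fragment and all
French outputs in the toy table are made up for the illustration.

```python
def generate(ir, run_stereotype):
    """Visit fragments top-down, skipping those already generated from.
    run_stereotype(ir, cursor) returns (french_phrase, other_fragments_used)."""
    generated = set()   # book-keeping: fragments already generated from
    pieces = []
    for cursor in range(len(ir)):
        if cursor in generated:
            continue
        phrase, used = run_stereotype(ir, cursor)
        pieces.append(phrase)
        generated.add(cursor)
        generated.update(used)
    return " ".join(pieces)

IR = ["John said a word", "to him", "Mary leaves"]

# Assumed results: generating from fragment 0 also consumes fragment 1.
TOY_RESULTS = {
    0: ("Jean lui dit un mot", {1}),
    2: ("Marie part", set()),
}

def toy_run(ir, cursor):
    # Raises KeyError if the loop ever asks for an already consumed fragment.
    return TOY_RESULTS[cursor]

text = generate(IR, toy_run)
```

The second fragment is never visited on its own, since the clause built
from the first fragment has already placed "lui" before the verb.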

The generation procedure is formally equivalent to an augmented
recursive transition network (Woods [3]). Functions in stereotypes
correspond to the syntactical constituents on the arcs. A list of
stereotypes as an argument for JEVAL corresponds to several arcs
leaving from a given state. Stereotypes may include predicates which
play the role of Woods' tests: the result of their evaluation
determines whether an arc will be followed or not. Woods' registers
take the form of LISP PROG variables, which function as push down
stacks and hold pieces of generated text or any desired
information.




References 


Wilks, Y.: "Preference Semantics", Stanford A.I. Project Memo 206,
1973. To appear in (ed. Keenan) The Formal Semantics of Natural
Language. Cambridge U.P.

Wilks, Y.: "Natural language inference", Stanford A.I. Project Memo
211, 1973.

Woods, W.A.: "Augmented transition networks for natural language
analysis". Report No. CS-1, Aiken Computation Laboratory, Harvard
University, Dec. 1969.




